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A LIE-GROUP APPROACH TO RIGID IMAGE REGISTRATION 

MARTIN SCHROTER*, UWE HELMKE*, AND OTTO SAUERt 

Abstract. The task of imago rcstration is to find the spatial correspondence of two or more 
given images. In this paper we assume that the correspondence is given either by an Euclidean, or 
by an affine volume-preserving transformation. Since the registration problem can be seen as an 
optimization problem on a finite dimensional Lie group, we use a recently developed framework of 
approximate-Newton methods on manifolds, which leads to locally quadratically convergent algo- 
rithms. To reduce numerical costs, we present two strategies: One makes use of the quasi Monte 

^"^ ' Carlo Method and the other ends up with an algorithm acting on spline function spaces. An extension 

-^1 ' for multi-modal image registration is given as well. 

Key words. Image Registration, Newton-like Optimization, B-splines, Quasi Monte Carlo 
Methods 
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p^ ■ 1. Introduction. In this paper we study the task of rigid image registration as 

an optimization problem on a Lie group. Although standard formulations of the prob- 
lem focus on two- or three-dimensional images, the subsequent mathematical analysis 
goes through in any dimension. Any gray-scale image is thus identified with its associ- 
ated intensity function / : M" — > M, which we assume to be a three times continuously 
r^ [ differentiable function with compact support. The rigid image registration task for 

two such functions /, g then amounts to find the Euclidean transformation (A, t) that 
minimizes the L^-distance 

I {f{Ax + t)-g[x)fdx. (1.1) 

> 
^— V ■ Since the set of rigid body transformations pA,t ■ K" -^ M",a; i-^ Ax + t, forms 

\^ I a Lie group, the Euclidean transformation group SE{n), we obtain a least squares 

optimization on SE{n), which is solved here using appropriate- Newton algorithms. 
'^ I Image registration is a fundamental task in image processing, with applications in 

f~>-. ■ various fields, including e.g. robotics [7] and geophysics [30]. For medical image ap- 

^D I plications, registration is used for image-based treatment planning and image-guided 

^^ ■ treatment delivery, see e.g. [3], [27], [14] and the references therein. Although an over- 

whelming number of publications focuses on non-rigid registration, where the task is 
to find a non-linear diffeomorphism, rigid registration is still of considerable interest 
and may lead to good starting point for a subsequent non-rigid registration phase; 
see e.g. [34], [25], [26], [37], and [17]. Often, as in e.g. [23], [32], rigid image regis- 
C^ I tration algorithms are designed, that employ fixed local coordinates of the Euclidean 

group via Euler-angles and then apply standard optimization algorithms on the affine 
parameter space. This simple local coordinate chart approach however has its draw- 
backs and may lead to ill-conditioned algorithms at the boundary of the parameter 
space. For example, in [16] it is observed that the singularities inherent in local Eu- 
ler angle coordinates may reduce the speed of convergence, compared to algorithms 
acting on the Lie group. Therefore Lee et al propose in [16] a linearly convergent 
Nelder-Mead algorithm on the Lie group SE{n) for multi-modal image registration. 
In order to achieve faster, local quadratic convergence rates we introduce a new type 
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of approximatc-Ncwton methods on SE{n) that avoids singularities of Euler-angle 
coordinates. 

Of course, the task of studying Newton's method on Lie groups is not new and has 
been already applied e.g. to robotics and computer vision problems; see e.g Park [5], 
Hiiper et al [8] and Sastry [20] for background material. However, such prior work 
suffers from a number of shortcomings that limit the applicability to image registration 
problems. The Riemannian Gauss-Newton method [1] performs a Newton step via a 
line search along a geodesic. For the Euclidean rotation group SO{n), such geodesies 
require the computation of the matrix exponential of a skew-symmetric matrix; which 
may be a formidable numerical task in high dimensions. Moreover, since the Euclidean 
group SE{n) carries no natural bi-invariant Riemannian metric, such geodesies are 
described by solutions to nonlinear second order differential equations which are hard 
to compute. Even in the case of non-rigid registration, the Lie group structure of 
the set of diffeomorphisms get a key position in the performance of the algorithms. 
In [18] it is mentioned that incorporating geodesies leads to an increasing of the 
accuracy. However, calculating the geodesies, which is equivalent to solve a time- 
varying ODE, is a time consuming procedure and several approaches arc known in 
literature to avoid this step: For example, in [38] vector fields arc used which fulfill the 
momentum conservation equation, in [35] the authors uses one-parameter subgroups 
to approximate the geodesies and propose a fast method for calculating the vector 
field exponential. We refer to [19] or [31] and the reference therein for a further study 
of the task of non-parametric image registration. 

In this paper, we present a new class of approximate Newton algorithms that 
are taylored to the least squares optimization problem (1.1) and avoid the above 
mentioned difficulties. Following earlier work on Newton's method on manifolds by 
Helmke and Moore [10], Shub [29], Manton [22], Hiiper and Trumpf [13], Helmke, 
Hiiper and Trumpf [9] and also Absil [1], in this paper, we use very simple local 
parameterizations of the Euclidean transformation group to compute an approximated 
version of the Hessian and to perform the Newton-step. These previous works mainly 
deal with minimizing trace-functions on SO{n) or on its homogeneous spaces. In 
comparison, we study the action of the Euclidean transformation Group on an infinite 
dimensional vector space and give an extension to the Special Afiine Group SA{n). 
We will show local quadratic convergence of the algorithms under suitable genericity 
conditions, which also arise in the classical theory of Newton methods on vector 
spaces. Our algorithm seems to be new even for minimizing standard trace functions. 
We are not awcar of similar algorithms on SA{n). 

A bottleneck in implementing such algorithms lies in the difficulty of effectively 
evaluating the higher dimensional integrals, which may suffer from the curse of di- 
mensionality. In this paper we will present and discuss two different strategies to 
circumvent this problem. First, we use Quasi Monte-Carlo methods to approximate 
the integrals by taking samplings of the functions at suitable random points. This 
leads to an easily implcmentable algorithm in the case of image registration. Sec- 
ond, we used i3-spline approximations of the images to get an approximation of the 
integral. The second method actually works better in practice as will be shown by 
examining concrete medical imaging tasks. Extensions to rigid registrations using 
volume preserving transformations are given, too. 

2. Local Parameterizations. To construct a Newton-like algorithm on a Lie 
Group, we use local parameterizations and coordinate charts. The difference to pre- 
vious work as e.g. [23], [32] is that we do not use fixed local coordinates, as our 



A Lie-Group Approach to Rigid Image Registration 3 

parameterizations change with the iteration points. This offers considerable advan- 
tages in designing the algorithm. Recall, that a local parameterization on an n- 
dimensional manifold M is a family {fip} ^^^ of smooth maps fip : R" —?' M that 
satisfies /ip(0) = p, p € M, and defines a local diffeomorphism around 0. Such local 
parameterizations exist on every manifold and provide coordinate charts around each 
point of the manifold. In contrast to usual coordinate charts, local parameterizations 
vary with each point of the manifold; a trivial fact, that actually helps to simplify 
the construction of numerical algorithms considerably. On a Riemannian manifold, a 
standard set of local parameterizations are given by the so-called Riemannian normal 
coordinates that are defined via the Riemannian exponential map. In the sequel, our 
local parameterizations have the advantage of being more easily computable than the 
exponential map, although they may not allow such immediate Riemannian geometry 
interpretations. 

2.1. Parameterization of Volume Preserving Transformations. For nec- 
essary background on Lie groups and Lie algebras we refer to [11]. Any affine trans- 
formation of R" is of the form x i— >■ Ax + t for a given transformation matrix A S M"^" 
and a translation vector t £ R". Let SL{n) denotes the special linear group of matrices 
A G R"^" with determinant 1. Its Lie algebra then is sl(n), the set of n x n- matrices 
with trace zero. The group of affine, volume preserving transformations than can be 
identified with the Lie group of all (n + 1) x (n -I- l)-matrices of the form 

which define the "special affine group" SA{n). By identifying a vector x G R" with 
its homogenous coordinates 

X 

1 

the affine transformation pM '■ x i-> Ax -f- 1 then becomes the linear map 

,r- f A t\ _ 
X i-^ MX ~ \ _ \ X. 

Using standard terminology from group theory, this just says that SA{ri) is the 
semidirect product SL{n) k R". The associated Lie algebra sa{n) of SA{n) consists 
of all matrices of the form 

I ) <^-^) 

with the condition that tr 17 = holds. Local parameterizations of SA{n) are then 
given as 

PM ■■ sl{n) X R" ^ SA(n) 

n v\ (2.3) 



Pm{^,v) := Af exp ^ 

In order to construct a computationally more feasible local parameterization, we 
consider the first order approximation of this map. Thus we decompose each Lie 
algebra element of s/(n) as 

X = Xi + Xd + Xu 
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where X^i is a diagonal matrix and X„,X/ arc strictly upper and lower triangular 
matrices, respectively. Let Aq denote the Q-factor in the Qi?-decomposition of a 
matrix A. One can easily show that 

e : sl{n) ^ SL{n), X ^ (I + Xi - X^)q [exp(Xd) + X, + Xj] 

is a first order approximation of exp \si(n)- This leads to the system of local parame- 
terization for SA{n) 

v'l^ ■ sl(n) X K" ^ SA{n) 

An important property of this parameterization which will prove useful in the 
sequel is that the derivative of i^^^ at the origin is the identity map. 

2.2. Parameterization of the Euclidean group. Every rigid body transfor- 
mation of R" can be decomposed in a rotation around the origin and a translation 
X I— > Ax + t. Thus the Euclidean group parameterizes all affine transformations 
PA,t{x) — Ax + t, where A £ SO{n) is a rotation matrix. Here SO{n) denotes the 
compact Lie group oinxn real matrices A satisfying AA^ = A^ A = I and det A = l. 
Using homogenous coordinates, this "Euclidean transformation group" becomes iden- 
tified with the subgroup of 5*^(71), consisting of all (n -I- 1) x (n + 1) matrices of the 
form (2.1) with A G SO{n). Similarly, the associated Lie Algebra se{n) consists of all 
matrices of the form (2.2) in which v S M" and fJ is a skew-symmetric n x n matrix. 

The matrix exponential map exp : se{n) — > SE{n) then provides us with a canon- 
ical map between the Lie algebra and the Lie group. This leads to the local parame- 
terization around any M € SE{n) of the form (2.1) as 

fiM ■■ so(ri) X R" -^ SE{n) 

n v\ (2.5) 



^A/(^,w) := A/exp J Q p 

Note, that the computation of the matrix exponential is expensive for large scale 
matrices but in the special cases n = 2, 3 explicit formulas such as that by Rodriguez 
(cf. e.g. [20] p. 27) are available. Nevertheless, such explicit formulas are not very use- 
ful in an optimization algorithm and the simplified formulas we use is more numerical 
efficient even in the low dimensional case. In the sequel, we approximate exp fl by the 
the orthogonal part of the QR-factorization oi I + ft. Note, that / -I- J7 is invertible 
for every skew-symmetric matrix 51, thus, {I + n)Q is always well defined. This leads 
the local parameterization of the SE{n) as: 

v^.^ : so{n) X M" -^ SE{n) 

ii + n)Q ii + ^n)v\ (2.6) 



4^(^,«):=m((^+^^)^ (^+/^ 



Note, that this map coincides with the previous map for SA{n), when restricted 
to 80(71) X R". In particular, Dv^j (0) = id holds, i.e. v^j is locally diffeomorphic 
around and a valid first order approximation of the exponential map. 
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3. Quasi-Newton Method. In this section, we propose a novel approximate- 
Newton algorithm for image registration that is based on the above local parameteri- 
zations. Our construction differs essentially from the well-known Riemannian Newton 
algorithm, which is based on knowledge of the geodesies to calculate the Hessian. Since 
the geodesies in SA{ti) are available only implicitly via the solutions of complicated 
nonlinear second-order differential differential equation, we prefer to avoid the Rie- 
mannian Newton method. Instead, we adapt a version of the approximate-Newton 
method as developed by Shub [29] and Hiiper and Trumpf [13], which has been already 
successfully applied for several optimization problems (see e.g. [9]). 

Let {/^Af IjvfgG be a set of local parameterizations of a Lie Group G. Thus, fiM 
is defined on an open neighborhood U C M" oi G U such that /ijv/ : [/ ^- G is 
diffeomorphic with fJ.M{0) = M. Additionally, we assume that fi{M,x) := ^m{x) is a 
smooth map. Let {i^m} MeG ^"^ another set of local parameterizations, subject to the 
same conditions. The (/x, v) Newton-iteration on G for a smooth objective function 
$ : G ^- R then is defined as 

Affc+i = UM, (' (hcss^o^,,^ (0)) ' V($ o /iMj (0) j , Mo e G, (3.1) 

where V/i(0) and Hess;i(0) denote the standard gradient and Hesse-operator of a 
smooth function h : M" -^ R, respectively. The iteration in (3.1) can be decomposed 
into the calculation of the Newton-step d = — (Hess^o^jj (0))~^V($ o /ijifj^)(0) and 
subsequent application of the local parameterization Af^+i = ^{^[^{d). The local 
parameterizations {^^^[} mgg ''^^^ used in (3.1) to calculate a classical Newton-step 
in Euclidean coordinates of R". The second parameterization {^m} j^^^q acts as a 
retraction of the tangent space onto the manifold, to carry out the actual Newton 
step. A practical choice of /i would be via the Riemannian exponential map, while 
for the retraction v any first order approximation of the exponential map would be 
sufficient. Local quadratic convergence of this method has been recently established; 
see [13], [9]. 

In the case of the image-registration problem, G is cither the Euclidean transfor- 
mation group SE{n) or the Special Affine Group SA{n). Moreover, {hm} is chosen 
as the exponential map, while (2.6), (2.4) are chosen for the retraction map {vm} j^^^q 
in SA{n) and SE(n), respectively. In this combination, neither the Riemannian nor 
the matrix exponential map have to be evaluated in a point different from zero. In 
order to reduce the numerical costs, we use the QR-factorizations in (2.6) and (2.4) 
for the update part of the iteration. The cost function to maximize on the Euclidean 
motion groups G = SA{n), SE(n) is $ : G -> R 

^{A,t)-~ I f{Ax + t)g{x)dx. (3.2) 

By invariance of the integral under volume preserving maps, this function differs from 
the least squares index (1.1) by a constant. In particularly, maximization of $ is 
equivalent to minimization of (l-l). 

3.1. Calculation of Gradient and Hessian. The aim of this section is to 
compute the Newton-iteration (3.1) for the objective function (3.2). In each iteration- 
step, we have to consider the function $ o fij^j and its first and second derivatives. 

Lemma 3.1. Let G denote either the Euclidean and affine Lie group SE{n) = 
SO{n) K R", SA{n) ~ SL{n) k R", respectively. Let fim denote the local parame- 
terizations (2.3), (2.5) and let $ denote the objective function (3.2). Endow the Lie 
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algebras q — sa{n),se(n) with their standard Euclidean inner product. For a fixed 
M e G, we get: 



(a) The gradient of ^ o fi in is V(<f> o ^m){0, 0) ~ (i^, v) with 

^ = / gM{z)Trt{\7f{z)z'^)dx V = gM{z)V f{z)dz. 



(3.3) 



Here gM '■— g ° Pm-^ and nt denotes the projection from gl{n) to the Lie 
algebra t ~ so(n) for G ~ SE{n) and t ~ sl{n) for G = SA{n), respectively: 

T^so{n){X) '■= -^{X - X ) ■Ksi^„^{X) := X /„. 

(b) The Hessian operator Hess^op,M (0) : — > o/ $ o fi^j at a critical point 
M e G is Hess^oi^MWi^:^) = i^:^) with 

ri^TTt ( if7^ fvf{z)z'^gM{z)dz + ^ jvf{z)z^gM{z)dzVL^ 



+ i Hf{z)Vlzz"^ gM{z)dz + i Hf{z)vz^ gM{z)dz 



v~ I Hf{z)gMiz){flz + v)dz 



(3.4) 



where Hf denotes the matrix representation of the Hessian of f . 
Proof. We calculate the directional derivative of $ o /i^/ as: 

— $o^M(Tr2,Tu) = \/f{Pexp{rno)Mx)^ Pnoexp{Tno)Mxg{x)dx (3.5) 



with fin = ( Q ) , ^/ = ( ) and P = (/„ 0) e K"x("+i) 



d 
d^ 

dT^ 



T=0 



$ o ^M(Tfi, Tv) = / Vf{Ax + t)^ {Q{Ax + t)+v) g{x)dx 



(3.6) 



K" 



r=0 



$ o ^Aiirn, Tv) = / {^{Ax + t) + v) H/(At + t) {n{Ax + t) + v) g{x)dx 



+ / V/(Ax + t)^ {VL^{Ax + t) + fiw) g{x)dx (3.7) 



After substituting z — Ax + t we get 



d_ 



^ ° A*M (t^, tv) = tr 



zVf{z)^gMiz)dzn 



+ { / V/(z)gM(z)rfz,t; 



Since the gradient {^,v) is the unique vector of the tangent space with 
d 



dT 



T~ 



$ o ijlm{tVL, tw) = tr (fi ' fi) + u ' z; 
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we have proved (3.3). 

To calculate the Hessian of $ o /ij\/ in zero, we again substitude z = Ax + i in 
formula (3.7) and get 

(ft r 

rfr2 



<f>o/ijv/(Tr2,ru) = / {VLz + v) \lf{z){9.z + v)gM{z)dz 



+ / Vf{zyn''zgM{z)dz+ / Vf{z)'^gM{z)dznv. 



Since the last summand is equal to v^v it vanishes in a critical point. Therefore, we 
optain the Hessian % by polarizing the two first sunimands 



•H<e.o,.m(o)(^, w)(^, «) = / {^z + v) H/(z)(f]z + v)gM{z)dz 

zVf{zYg{A'^{z-t))dz I {nvt + hvi) 



1 

-tr 
2 



which proves (3.4). D 

Note, that (3.4) yields the Hessian of $ o /xjv/ in only at a critical point M G G. 
In the sequel, we will use the same formula at an arbitrary point AI E G and thus 
obtain a modified Newton algorithm for <J>. Thus, the modified Newton-step in (3.1) 
requires to solve the following system of linear equations: 

Uf{z)gM{z){nz + v)dz = - [ gMiz)V.f{z)dz (3.8) 



and 



TTt i^n^ I V f{z)z'^ gM{z)dz + ^ j V f{z)z^ gM{z)dzVL' 



+ / B.fiz)nzz ' gM{z)dz + / llf{z)vz ' gM{z)dz) (3.9) 

ffM(2)7r{(V/(z)z^)dz 



with the unknowns w G M and SI G g. 

In order to rewrite (3.8) and (3.9) in a linear equation in the components Vi and 
r^ij", we first focus on the Euclidean transformation group. Here, with J7 g so{n) we 
obtain: 

''Y{f{z)gM{z){nz + v)dz = - f gM{z)\/f{z)dz (3.10) 



and 
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i / {-nVf{z)z'' - zS/f{z)^n) gM{z)dz 



\ I {-Vf{z)z'^n-nzVf{z)'^)gM{z)dz 



+ J {}if{z)vz'^ - zv'^Rfiz)) gM{z)dz (3.11) 

+ [ (H/(z)OzzT + zz'^nUfiz)) gM{z)dz 

gM{z){Vf{z)z'^ -zVf{zY)dz. 



To evaluate the components of this system of linear equations, we use the abbre- 
viations: 

f df f df 

ckj = 9{x)-:^(Ax + t)dx, Pi i = 9{x)(Ax + t),— — (Ax + t)dx, 

J oxi J dxi 



li,3,k = / g{x){Ax + t)i- — - — {Ax + t)dx, u^j = / g{x)-^—^ — {Ax + t)dx, 



(3.12) 



Si,j.k,i^ I g{x){Ax + t),{Ax + t)j-^-^{Ax + t)dx 



We obtain: 

Lemma 3.2. Let (^2, u) € so{n) x M" be the modified Newton- direction for the 
objective function (3.2) in a certain point M € SE{n). Then the components flk,^ 
1 ^ k,l ^ n of fl and v^ I ^ k ^ n of v satisfy 

'^{li,k,i - lk,i,i)^k,i + ^ Ei.fcWfc = -ai (3.13) 

k>l k 

for 1 ^ i ^ 71 and 



k>j k<j kyi 

9 ^il^3,k + Pk,])^i,k - ^(^i.fc.ij - 5j^i^k,i + ^i,l,k,j - SiMA,j)^k,l (3.14) 



2 

k<i k>l 



for 1 ^ z < j ^ 71. 

Note that the unknowns of this system are Vi and ili.j for i > j. Therefore, a 
unique solution of the linear system corresponds to an unique element of the so{n). 
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Let US return to the case of volume-preserving transformations G — SA{n). The 
modified Newton-equation (3.8) and (3.9) has now the form: 

Hess/(z)5A/(z)(fi2; + v)dz = - / gM{z)Vf{z)dz (3.15) 



and 



ir!^ J Vf[z)z'^gM{z)dz + i y" Vf{z)z'^gM{z)dzn' 

R" R" 

-I- / Y{esSf{z)Vtzz"^ gM{z)dz + i Hess f {z)v z"^ gM{z)dz 

R" K" 

1 I 9m{z) {zJn'^\/f{z) + z'^nRcssf{z)z + z^Hess/(z)u) dz (3.16) 

R" 

= - f gMiz)\/fiz)z'^dz + i/ f gM{z)z'^\/f{z)dz. 

R" R" 

Again, we can calculate the components of this system using the coefficients in (3.12). 
We end up with an analog version of lemma 3.2 in the case of a volume preserving 
transformations. 

Leimivia 3.3. Let (fi, u) G sl{n) x K" be the modified Newton- direction for the 
objective function (3.2) in a certain point M € SA{n). Then the components flk.i^ 
1 SC fc, / ^ n, (fc, /) 7^ {n, n) of Vt and Vk of v satisfy for each 1 ^ i ^ n 

22 ll,k,i^kJ + 2J (7fc,fc,i - ln,ns)^k,k + ^J ^i^k^k = -O-i (3.17) 

k^l k^n k 

and for all 1 ^ i, j ^ n, {i,j) ^ in,n) the following equations: 



■" fc k {kj)^{n,n) 

-5j,n,n,i 2_^ ^k,k + 2_^l],k.iVk = ~Pi. 



(3.18) 



for i ^ n, j ^n, i^ j, 



-Z 2_^ (3n,k^j.k + ^ 2^ Pk,j^k,n + ^^ Sjj^k,n^ 

k k^n {k,l)^{7i,n) 

-{S],n,n,i + ^/3nj) ^Z ^'^^'^ + 5Z ^J,k,nVk = "A 



(3.19) 



ky^n 

for i = n, j ^ n, 



-Z 2_^ Pi,k^n,k + Z 2_^ Pk,7i^k,i + 2^ 5n,l,k,i^k,l 

k^n k {k,l)^{n,n) 

-{5n,n,n,t + ^A.n) ^J ^k.k + Jj 7n,fe,iWfe = -f3i,n 



(3.20) 



2 

k^n 
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for j — 71, i ^ n and 



^ ^ 5l,m,k,m 1 1 ^k,l 



fc k {k,l)^{n,n)\ \ m 

\ m / / fc^tn k \ I 



A: 



(3.21) 

fori^j, i,j ^ n. 

We want to point out that the presented Newton step uses an approximated 
version of the Hessian. However, at each critical point of the cost function we have 
an exact evaluation of the Hessian. Approximations of the Hessian appear also in a 
couple of classical algorithms like the Gauss- Newton method, the Levenberg Marquard 
algorithm (see e.g. [1] for an overview) or the optimization technique presented in 
[21]. In contrast to our method, such algorithms do not perform a Newton step at a 
critical point and are not necessarily local quadratic convergent. 

3.2. The Quasi-Monte-Carlo Newton Algorithm. Even though the previ- 
ous subsection defines itself an iterative algorithm, a few specifications have to be 
made in order to apply it to the image registration task. 

First, we note that the most time-consuming part of the algorithms is the calcu- 
lation of the integrals appearing in the coefficients a, . . . , e in (3.12) - if they can be 
calculated at all. One way out is to approximate the integrals via Quasi Monte-Carlo 
methods. (See for example [15] for an introduction.) Thus, we replace in (3.12) the 
integrals by the average of sampled function- values via 



QCR" 

Explicitly, we use 



1 ^ 
J /(x)dx«-5]/(x,0. (3.22) 



1 ^ df 1 ^ Of 



Ji 



Of , . . ^ 1 v-^ , , Of 

r— 1 r— 1 

l..,k = J^Y.9M{AXr+t),g^^^JAXr+tl ,,,,.. _^g(^,)^_^-^(Ax.+i), 
r— 1 -^ r— 1 -^ 

1 ^ aV ^'-''^ 

Sij^kd = -j^Y 9{Xr){AXr +t)i{AXr +t)j-^ —{Axr +t). 

r—1 

The final registration algorithms for G = SE(ri) and G = SA(ri) are summarized in 
the Tables 3.1 and 3.2. 

In contrast to a classical Montc-Carlo method, the sampling-points xi G Q are 
chosen to be uniform distributed in Q. The so-called Halton sequence [6] is a good 



A Lie-Group Approach to Rigid Image Registration 11 



Table 3.1: QMC-Newton Registration- Algorithm 


on SE(n) 






Step 1. 

Pick an initial guess Mq G SE{7^) and set m = 0. 










Step 2. 

Calculate a;, fiij, 7i,j,fe, 5ij^k,i and etj for all 1 ^ z 

tion (3.23). 


j,k, 


i < 71 as defined in 


equa- 


Step 3. 

Solve the linear system consisting of the equations 










'^{'yi,k,i - 'yk,i,i)^k,i + ^ i^iM'k = 

k>l k 


-Oii 


for all 1 5? 


i ^ n 




and 










k>j k<j 




^(ft,fc+/3fe, 


,)^k,^ 




k<i k>l 


+ 5. 


l,k,j — &i,k,l. 


,)^k,l 




k 


for all 1 ^ i < 


j ^ n 




with the unknowns Vi and ilij, j > i. 










Step 4. 

Construct the n x n matrix f2 with the entries flij . In the case of j > i use the solution 


of step 3. Else, set 










r -fij, , for j < i 
"''^^\ forj=i 










Compute 










M^+,:=u^jl(n,v), 










where v'^^ is defined in (2.6). 










Step 5. 

Set m = m + 1 and goto Step 2. 











example of such uniformly distributed sampling points. It is shown in the literature, 
that for such well chosen sampling points, the approximation error of (3.22) is bounded 
by 0{{logN)'^ /N). (In contrast, the error of the Monte Carlo method tends to zero 
with the order 0{l/vN).) The final algorithms are presented in Table 3.1 on SE[n) 
and in Table 3.2 on SA{n) respectively. 

Before stating our main convergence result, a few remarks are in order. We use 
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standard Gauss elimination to solve the linear system in step 3 of each algorithm. If 
this system is not solvable, a standard approach in optimization theory is to search 
for the least squares solution. Moreover, as is the case for all Newton methods, 
convergence of the algorithm is not guaranteed for an arbitrary initial conditions 
Mq G G. Even if the algorithm converges, the limiting point need not be a local 
maximum. To overcome this, one can adapt a Gauss-Newton step, or first test if 
the Newton-direction {ft,v) is an ascent-direction. Alternatively, we can take the 
gradient of the objective function instead. Furthermore, one can make a line-search 
in the ascent-direction, e.g. by using the Amijo-rule. We skip the straightforward 
details. 

Theorem 3.4. Suppose g e C(M",R) and J e C3(]R",K). Then the QMC- 
Newton algorithms described in Table 3.1 and Table 3.2 being applied to the two fol- 
lowing cost functions 

(a) <^ : G ^M. in (3.2) with coefficients (3.12) and for the 

(b) '^ -.G^R defimed by 



1 ^ 
*(^' ^^-nT. /(^^- + *)5(^-)- (3-24) 



r = l 

with coefficients (3.23), 
are locally quadratically convergent around each nondegenerate critical point. 

Proof. To begin with, let us first focus on the cost function $. We observe that 
the algorithms in Table 3.1 describes a (/i, z/)— Newton algorithm on G = SE{n), 
where /i and v are defined in (2.5) and (2.6). Respectively, Table 3.2 applies a 
(/i, z^)— Newton algorithm on G = SA{n) where fi and i> are defined in (2.3) and 
(2.4). These parametrizations satisfy 

DfiMiO) = Dvm{Q), (3.25) 

for all M e G. Moreover, the cost function $ is in C^{G,R) since g G G(R",R) and 
/ G G^(M",Ili). Thus, we can apply the local quadratic convergence theorem from 
[13] or [9]: There exists an open neighborhood V d G oi each critical point M* £ G 
such that the point sequence {Mk}f.^^ generated by the algorithms in Table 3.1 or 
Table 3.2, converges quadratically to M*, if Mq € V . This proves (a). 

To prove (b), we use the fact that the process of differentiation with respect to 
M and the process of sampling in (3.22) commute. Hence, we obtain the gradient 
of ^ if we apply the approximation (3.22) to the right hand side of (3.11)-(3.10) for 
G = SE{n) and of (3.15)-(3.16) for G = SA{n) respectively. In the same manner, we 
get the Hessian in a critical point by approximating the left side of these equations. 
Therefore, using (3.23) in (3.13)-(3.14) and (3.17)-(3.21), respectively, yield again 
a (/x, I/)— Newton step and the local quadratic convergence is obtained by the same 
argument as for the function $. D 

A few remarks are in order: Firstly, the assumption that / is three times contin- 
ious differentiable is only needed to ensure the local quadratic convergence behavior. 
The QMG-Newton algorithms need only evaluations of the first and second derivative 
of /. Moreover, since the raw image data are usually given in a discreticed form, a 
previous interpolation step is needed to extend the function continously. If the in- 
terpolation order is high enough, then / e G'^(M",K) can always be guaranteed (see 
section 4.1). 
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Table 3.2: QMC-Newton Registration- Algorithm on SA(n) 

Step 1. 

Pick an initial guess Mo € SA{n) and set m = 0. 

Step 2. 

Calculate ai, Pij, 'yi,j,k, Sij,k,i and e^j for all 1 < i,j,k,l < n as defined in equa- 
tion (3.23). 

Step 3. 

Solve the linear system described in (3.17) - (3.21) with the unknowns Vi and i^ij, 

ihj) / {■n,n). 

Step 4. 

Construct the n x n matrix fi with the entries 0,i,j. In the case of [j, i) 7^ (n, n) use the 

solution of step 3. Else, set 

i'n,n — y ^'k,k 

k^n 

and compute 

where v'^^ is defined in (2.4). 

Step 5. 

Set k = k + 1 and goto Step 2. 



In [13], sufficient conditions for local quadratic convergence are: A C^ cost func- 
tion, condition (3.25) and the non-degenericity of the critical points. We will show in 
the appendix that the last condition is fulfilled for a generic choice of the images. 

Note, that the cost function (3.24) is only an approximation of (3.2); a more 
sensible choice would be the discretization of the least squares as 

N 

Y,{fiAxr+t)-g(xr))\ (3.26) 

r = l 

The above local quadratic convergence result would hold as well for suitably adapted 
choices of coefficients. 

The local quadratic convergence of the algorithms will be numerically confirnied 
in section 5. However, this property has more advantages the bigger the region in 
which the algorithms converge quadratically is. In Fig. 3.1 it is demonstrated that 
this region can be very small if / and g are interpolations of raw medical images. (The 
region of quadratic convergence is a subset of the region in which the cost function is 
convex.) A way out is to smooth the image up to a certain level and to registrate the 
smoothed image data instead. The result can then be taken as the initial guess for the 
algorithms applied to less smoothed images etc. Throughout this paper, we perform 
the smoothing by projecting the images to a finite dimensional function-space, namely 
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-m spline approx. 
-9C0coefficierrts 

1 coeffldeits 




Fig. 3.1. The first image shows a 2D-slide of a CT data-set of the head with 350 X 350 pixel. 
The second image is a spline-approximation with 900 basis- functions. The appearing artefacts may 
be caused by the Gibbs phenomena. The blue graph of the third image describes the cost function 
(3.2) by translating the image on the left against itself. For the green and red graph we replace the 
image by its spline- approximation with 900 and respectively 81 basis- functions. 



a spline-basis (sec section 4 for details). Fig. 3.1 shows the change of the objective 
function by varying the degree of smoothness. In all the examples we considered, the 
smoothing of the images yields an increase of the domain of quadratic convergence. 

4. Registration on Spline Function Spaces. If we consider an image as 
a function on the space of pixels, smoothing can be regarded as a projection to a 
suitable space of smooth functions, such as e.g. spaces of splines. We will show that 
this interpretation, compared to standard smoothing methods like Gaussian filters, 
has the advantage that we can exploit the reduction of information in the previous 
discussed Newton method while preserving a high degree of accuracy. A difficulty 
with this approach though is that the space of splines is usually not invariant under 
rotations, as happens for tensor product splines. In this section we work out the 
details in this setting and indicate a way how to avoid such a difficulty. 

4.1. Spline- Approximation of the Images. In this subsection we will de- 
scribe the projection of a row image data to a spline function space. A medical image 
/ can simply be seen as a mapping of vertices of a grid to gray values. 

/ : Z" ^ M. 

Of course, in all applications, / will only be defined on a finite set, but without any 
restriction, we can continue it by setting all other values equal zero. Furthermore, the 
Nyquist Sampling Theorem, (see for example [24],) allows us to extend / to a unique 
band-limited function on M" 

/ : M" ^ M. 

Of course, using this function in image processing exercises feeds to enormous nu- 
merical costs. Therefore we use a projection of / into a finite dimensional function 
space S. Defining this space, we employ the so called B-splines 



B°(x) 



if - i s: X s^ i 
else 



B^{x) := / B''-\s)B"{s - x)ds, 



extended in n-dimension by their tensor-products 

B'^ixu. ■ . ,Xn) ■■-- B'^ixi) ■ . . . ■ B'^iXn). 



(4.1) 



A Lie-Group Approach to Rigid Image Registration 15 

Fig. 4.1 shows the graphs of the sphnes in first and second order in one dimension. 
The most important fact of a B-sphnes for us is that it becomes smoother when the 
order is growing. The corresponding sphne function space can now be defined as the 
set generated by all integer-translations of (4.1) (cf.[28]) 



-fc 




Xri 



5^=<! >, CrB'^i^^n,...,^ 



cehy A e N. 



The condition c & I2 ensures that the L2-Norm of all functions in S^ exist. Note that 
we will only work with images that have a bounded support, which is the reason why 
we only consider finitely many linear combination of tensor product functions B'^. 
Therefore, the condition is always fullfilled. The additional parameter A G N is used 
in image processing to neglect high frequency informations of the images (cf. [33]). 
That is, if A is increasing, the projection ffi^ £ S^ of an image R becomes smoother 
(cf. Fig. 3.1). 

4.2. Registration of Spline Coefficients. We begin with a reformulation of 
the registration problem. For two given images f £ S^ and g £ S\ with coefficients 



c 



/ 



£ I2 the objective function (3.2) becomes 



<!>{A,t)= Y. ci4J Bl^{Ax + t)Bl^^{x)dx, 

with B^ y^{x) := B^ (^ — Ar). Since B^ ^{x) is a translation of Sg Ai ^^ obtain 
$(Af)= Y. ci4JBl^{x)Bl^{A{x + s)+t-r)dx 

« Y ci4JBl^{x)Bl^{x-A-\r-t) + s)dx. (4.2) 

The approximation in the last line needs some explanations: Of course, the ten- 
sor products (4.1) are not invariant under rotation. Assuming they are, we obtain 
approximations, by rotating the argument of the second spline around the barycentre 
t — r in such a way that the argument becomes a simple translation in the direction 
A^^ir — t) — s. In the cases where A is close to the identity (which is true for most 
of the medical image problems) the approximation error tends to zero. Our examples 
show that this simplification does hardly influence the solution of the optimization 
problem. 

Therefore, the convolution of two splines 

F{s):= f Bl^ix)Bl^{x - s)dx, 

I (4.3) 

= Bl^is^) ■ . . . ■ Bl^isr,) 

is related to the optimization problem. Actually, F(s) is, by definition, the tensor 
product of B-splines of order four that is well studied in literature [33]. For us, the 
most important fact of such B-splines is that they are three times continuous differen- 
tiable, which makes (4.2) adaptive to the Newton-algorithm (3.1). Furthermore, F{s) 
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^ 


-"■""^^''dT 


"~^^^ 





b" 


\ 


\ / y^ 
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d 




/ 




(fe 




/ 






V-10 


/ 




di 




/ 


^^~ 


<i7? 












. 



B^{x) 



3^4(5 + 2:^)4 

^(55 - 20a; - 120a;2 - SOx^ - IGx*) 

^(115 -120x2 + 48x4) 

g^(55 + 20a; - 120^^ + SOa;^ - IQx^) 

i(-5 + 2a;)4 





for 
for 
for 
for 
for 
else 



_3 < T < — - 

2 ^ ^ 2 

i < X ^ I 

2 < X S; 2 



Fig. 4.1. The first image shows the B-splines of order zero, one and two. The second image 
shows the B-spline of order four, with its first and second derivative. 



is piecewise polynomial with degree four, so the values and the derivatives of the func- 
tion can be calculated very quickly and without additional numerical approximations. 
To sum up, the modified image registration problem becomes 



max V c{ciF{PM-^f- s). 



mgG 



s,r£\Z" 



To calculate the Newton-step for this optimization problem, we apply the setting 
in section 3.2 for the objective function 



$(Af ) := Y^ c{4F{PMf ~ s) 



(4.4) 



and obtain an analogous formula of (3.3) for the gradient V($ o /ij\/)(0, 0) := {^,v) 
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with 



n 



Y^ 44TTt{\/F{Ar + t-s){Ar + ty) 
Y^ 44VF{Ar + t-s). 



(4.5) 



Similarly, formula (3.4) for the Hessian operator i/ess$opM(0)(il,-y) = {^,v) is re- 
placed by 



I Y 44(n'^VF{Ar + t~s){Ar + ty 



Cl = TTf 



+ VF{Ar + t - s)iAr + t)^n'^ + 2v{Ar + i)^Hi.(^r + t-s) 



2HF(Ar + t - s)n{Ar + t){Ar + t)^ 
: Y c44IiF{Ar + t-s){n{Ar + t) + v) 



(4.6) 



Using the same calculations as in section 3 we end up with two new Spline- 
based Newton Registration- Algorithm (SB), one is acting on SE{n) and one on 
SA{n). Both algorithms have almost the same steps as their continuous counterparts, 
the QMC-Newton on SE{n) and on SA{n) respectively, which is the reason why we 
don't present them here once again. The only difference in the algorithms mentioned 
before is the calculation of the coefficients a, . . . , e which is done in step 2 of each 
algorithm. (See Table 3.1 or Table 3.2) 

In the case of the splinc-bascd algorithms, these coefficients have the following 
form: 

dF 



= Y. ci4--{Ar + t-s) 



s,rG4; 



lid-k^ Y 44{Ar + t), ^^ ^^ {Ar + t- s) 



' dxjdxk 



kj 



fc,, = Y 44{Ar + t),{Ar + t) 



dxkoxi 



E 



^ dxidxj 



(Ar + t-s). 



One may argue that the above presented SB-Newton algorithm requires a high de- 
gree of differentiability of the images, as it is assumed in the QMC-Newton algorithm. 
Exactly as in Theorem 3.4 the only condition on the images / and g to show the local 
quadratic convergence of the SB-Newton algorithms (Table 31 or Table 3.2) is that 
the cost function $ in (4.4) is in (7^(6*, M). However, here, the construction of $ via 
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splines reduces the requirements of differentiability on / and g. Explicitly, for f Cz S^ 
and g G S^ this condition is fuUfilled if and only ii p + q ^ 3. Thus, for p = 2 and 
5 = 1 we have / £ C^(]R",R), g € C(IR"',M) and get the local quadratic convergence 
of the SB-Newton algorithm in the same way as it was shown in the QMC-Newton 
algorithms (where / £ C^(]R",]R) is needed). Moreover, the SB-Newton algorithms 
need no evaluations of the derivatives of / or g. They work directly on the given 
coefficients, given by the spline representation of the images. 

4.3. Extension to Mutual Information. Following the approach of Viola [36], 
the calculation of the mutual information of two images / and g is done in two 
steps. Firstly, one has to calculate f{xi) — g{xi) for a couple of supporting points 
{xi\i^j C K" to give an estimation of the joint density Pf^g. Afterwards, the integral 
/ Pf^g log pf^g is approximated numerically - usually by a Monte Carlo method. 

Let two images f G Sf and g G S^ be given. We estimate the joint density by 
using the coefficients c£ and c^ of / and g: 



Pf,gW 



^ ,B^i±^M^^\. (4.7) 



nG^ 



Here, cr G M controls the approximation error of pf^g (, ef. [2] p. 88-95), and B^ 
denotes the cubic B-Spline in one dimension (cf. Fig. 4.1). In image registration we 
are especially interested in the case in which g is deformed by an element M € G. 
Therefore, we have to replace c^ in (4.7) by the coefficients c^ oi g{PMx). However, 
in general, g{PMx) is not in S\ for an arbitrary M G G and a additional projection 
step is needed to get c^. We make the same simplification as in the previous section 
and get 



A t 



1/ ^^ 






cl-iu^rF{A-\r-t) + s). (4.8) 



with suitable ^u.r S R- The weights ^u,r are necessary, since the translations of the 
splines do not form an orthonormal system. They can be calculated explicitly and 
independently of/ and g. Substituting c^ in (4.7) by (4.8) provides an explicit formula 
for the objective function $ : G ^- K: 



$(A/) := I q\ 5^ frS^ ( ^ (^" cg(M)) \ j ^^^ ^^^^ _ -xlogx (4.9) 

In the sequel, we use the Simpson-rule to approximate the entropy /p/,g logP/,g- 
Following [36], we have to search the maximum of $. Since (4.9) is an three times 
differentiablc function, we can use the same techniques used in section 3.2. We get: 



'^w^^EH'^ E ^'(^-^^^-^'W)) I (4.10) 
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with 



p ^-^ \ ^-^ \o-p a 



ci-cUM)) 






and 

Hess$opju(0)(O,v) 

+ -E^'f- E B^(--kci~£mi)) 
ptz \ .,'t;„ Vp ^ 



ap a 



«GiZ" 

Where i?'^ and B^ denote the first and second derivative of B^ (and the same holds 
for g' and g"). 

Before writing the corresponding Newton step in components, we need some sub- 
stitutions: 

\ap a J 

K."-B^"{^^-'-ici-ciiM))) 

'l-9"L E ^'(^-^^^-^^"(^^)))) (4.11) 



(?fc 
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and 

Cr,s,i =2_/ (fi'fe^M.fe" + 9kbu,k''^) iVu)ti^u)r,s 
k,u 

"^r.i =X! {Sk^^,k" + 9kK,k'^) {Vu)i{Vu)r 

k.u 

- / ^ \Q'lzhu,k + Qk^u,k ) (^^i)zj(^u)r,s 



r.s,i.j 

k.u 

0k,us ^g'kbu,k'{vu)i (4-12) 



^k,u,i,j — Qk"u,k ("uji 



Here, (f;ti,r2„) denotes the gradient of the function c^ o fi]\.j in zero. Note that 
c£ o /^j\/ is a function of the form (4.4). Hence, the gradient and the Hessian in zero 
are aheady calculated in (4.5) and (4.6). Therefore, the ith component of the vector 
part of the Newton equation becomes 

Y^ QkKk (HeSSgSo^^^ (0)(f7, W))^^^j„^^^ + ^ Cr,s,i^r,s + ^1 ^^^^^'"^ = ^ XI ^'^'"'^ 
k.u r,s r k,u 

and the (i, j) component of the corresponding matrix-part is 

^ g'f,hu,k' (HeSSeJopjv, (0)(^, v)) ^^^^^^j + Y^ ^r,s,i,j^r,s + ^ J.r^r ^-^j ^k,u. 



i-.J- 



r.s r k 



;,u 



To expand these two equations in full detail, we make the same calculations as in 
the algorithms before and set 

ttj = ^6i^&„,t,' Y 4lu,r-^{Ar + t-s), 



h,j = Y b'iK,v' Y ^iln,r-^{Ar + t - s){Ar + t)j, 



s^r(^-. 



s,rG ^ 



s^r(^-. 



dxi 

dF 
dxt 

dxjdxk 

dxidxk 



7»j,fc=X!^^^"."' Yl cf7„,^(Ar + t),-— -— (Ar + t-s), (4.13) 



Q2p 



Now we have all necessary instruments to present the Newton equation. In the 
case of the rigid registration we end up with the following analogue to the linear 
system (3.13)-(3.14): 

Lemma 4.1. Let (51, u) G so{n) x M" be the Newton-direction for the objective 
function (4-10) in a certain point M G SE{n). Then the components ^k,i ,^ ^ k,l ^ 
n, of fl and vk 1 ^ fc ^ n, of v satisfy 

X](7^fe,i - lk,iA + C,kj,i - CLks)^kd + Y,^^^-^ + ^k^^k = -Yj ^'^^"■^ (4-14) 

k^l k k,u 
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Table 4.1: Mutual Information-based Registration- Algorithm on SE(n) 

Step 1. 

Pick an initial guess Mq E SE{n) and set m = 0. 

Step 2. 

Calculate all coefficients in (4.11), (4.12) and (4.13). 

Step 3. 

Solve the linear system described in (4.14) and (4.15) with the unknowns Vi, 1 ^ 

i ^ n and ^ij-, 1 ^ i < j ^ n. 

Step 4. 

Construct the n x n matrix O with entries fli_j. li j > i use the solution of step 3 

or else set 



ihj 



( — Oji for j < i 
for j = i. 



Compute 



Mm+i:^i^f/'{n,v), 



where v^^ is defined in (2.6). 

Step 5. 

Set TTi = TO + 1 and goto Step 2. 



for all 1 ^ i ^ Ji and 



k>j k<j k>i 

9 2^(/^j,fc + Pk,j)^i,k — 2_^(^i,k,l,j ~ ^j,l,k,i + ^id,k,j — Si,k,l,j — ^k,Li,j + ^l,k,i,j)^k,l 

— 2^(7j,fc,i — 7^fcj — Ci,3,k)vk = - 2_^ ''kM,i,, 



2 

k<i k>l 



3- 
k,u 

(4.15) 



for all 1 ^ i < j !^ m. 

We finish this section with the corresponding lemma in the case of volume- 
preserving transformations. 

Lemivia 4.2. Let {ft,v) G sl{n) x M" be the Newton- direction for the objective 
function (4-10) in a certain point M G SA{n). Then the components flk^i, 1 ^ k,l ^ 
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Table 4.2: Mutual Information-based Registration- Algorithm on SA(n) 

Step 1. 

Pick an initial guess Afo G SA{n) and set m = 0. 

Step 2. 

Calculate all coefficients in (4.11), (4.12) and (4.13). 

Step 3. 

Solve the linear system described in (4.16) - (4.20) with the unknowns Vi, I ^ i ^ 7i 

and ftij., 1 ^ i,j ^ n and (i,j) 7^ {n,n). 

Step 4. 

Construct the nxn matrix 51 with entries ^2^^. If {j,i) ^ in,n) use the solution 

of step 3 or else set 



and compute 



Mr^+,:=,,ff'in,v), 



where v^^ is defined in (2.4). 

Step 5. 

Set m ~ m + 1 and goto Step 2. 



n, (fc, /) 7^ (n, n) of fl and Vk I ^ k ^ n of v satisfy for each 1 ^ i ^ n 

2_^ill.kA + Ck,l.i)^k,l + 2^(7fc,fc,i — 7n,ri,i + Ck.k.i ~ Cn,nA)^k,k 
k^l k^n 

+ ^{etM-Vk.t)vk^-Y,^k,i,t- (4.16) 

k k,l 

and for all 1 ^ i, j ^ n, {i,j) 7^ in,n) the following equations: 

/ ^ Pi^k^j^k + 2_^ l^kj^k.i + 2_^ i^3.l,k,i + ^l,k,i.j)^l,k 

k k (k.l)^(n,n) 

(4-17) 

— 2_^i^J^k,k,i — '&k,k,i,j)^k,k + 2^(7j,fc,i ~ Ci,j,k)vk = — 2_^ ''k,l,i,j 
k^n k k^l 

fori ^n, j ^n, i ^j, 

2_^/3n,fc%.fe + 2^ /3fc,j^fc,n + 2^ {Sj,l,k,n + '^l,k,n,j)^l,k 

k k^n (k,l)^in.n) 

1 , y (4-1^) 

k^n k k,l 
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Fig. 5.1. Consideration of convergence speed for a 5^4(2) problem. Middle: QMC-Newton 
algorithm, for several number of base-points. Right: SB-Newton algorithm for several spline- 
approximations of the images. 



for i == n, j ^ n, 



y ^ PiM^n,k + 2_^ Pk,n^k,i + ^^ {5nd,k,i + '&l,k,i,n)^l,k 

k=in k (kd)^{n,n) 

1 . \ (4-19) 

■ 2^('5n,fe,fe,i — '&k.k.i,n + T:Pi,n)^k,k + 2_^{ln,k.i — Ci,n.k)vk ~ — 2_^ ''k,l,i,n 
k^n k k,l 



for j ~ n, i ^ 11 and 



ll.l,k Vk 



(k,l)^{n,n) 



Y^ kk^3,k + Y^ Pk.J^k,^ + J2 ^j'fc^' " ^^'3'k --J2 
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k^n \ " V m / / 

■^ n ^-^ j 

in / 



(4.20) 



E 

k.l 



fori^j, i,j ^ n. 

With this lemmas at hand, one can easily check as bevor that the algorithms 
presented in Table 4.1 and Table 4.2 are locally quadraticaly convergent. 

5. Experimental Results. All computations in this section are performed us- 
ing MATLAB R2008a on a 1.9 GHz laptop with 2 GB RAM. In our first example (Fig. 
5.1), we demonstrate the local quadratic convergence rate of the described algorithms. 
We took a 2D cross-section (250 x 250 pixel) of a CT data set. The reference image of 
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our artifical problem is the corresponding spline-approximation, with 289 coefficients. 
The template is identical to the reference. In this example we always start with the 
initial transformation 



.2 


0.5 
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Afo = 0.8333 20.8 . (5.1) 



(Figure 5.1 left buttom shows the original image transformed by this matrix.) 
The middle of Fig. 5.1 plots the distance of the limiting point using the registration 
algorithm on SA{2). To calculate the coefficients (3.12) we use the Quasi Monte Carlo 
approximation with the 500, 1000 and 5000 first elements of the Halton sequence [6]. 
In all three cases we see a local quadratic convergence and in only 12 steps we achieve 
an accuracy < 10~^^. Since the objective function (3.24) is a discretized and approx- 
imated versions of the correlation- measure (3.2), there is a discrepancy between the 
exact and the detected transformation. That is, the limiting point of a particular 
iteration in Fig. (5.1) is close to but not equal to identity matrix. For example, for 
5000 base-points, the algorithm ends with 
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which is not very close to the optimal solution A/ = I^. (However, this gap would 
not appear in a discretized version of the "sum of squared difference" .) The graph on 
the right of Fig. 5.1 shows the speed of convergence of the spline-based registration 
algorithm. Here, the reference image is the spline-approximation of the original image 
with 225, 529 and 2304 coefficients respectively. The template is again identical to the 
reference and the initial transformation is again (5.1) for each experiment. Once more, 
we obtain a local quadratic convergence. In contrast to the Monte Carlo version of 
the registration algorithm, the limiting point of this SB-Newton registration is much 
closer to the identity matrix: In the case of 225 spline-coefficients the algorithm end 
with 

/ 1.0008 -0.0004 -0.043 
M2a = -0.0002 0.9992 0.12 
\ 1 , 

In case of 2304 coefficients we achieve a distance to the identity matrix of 3.7 • 10~^. 
In our next example, we examine the convergence of the SE{3) algorithms. We 
consider a 250 x 250 x 20 CT data set of a head (see Table 5.1). In contrast to the first 
example, we first transform the image and then make use of the spline-approximation 
to achieve the template (which is much closer to a natural registration problem). The 
used transformation is 
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(5.2) 



which is consistent with a rotation of 17.2 degrees around the central principal axis 
of inertia. In Table 5.1 we note the difference between the detected and the requested 
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Table 5.1 
Comparison of the SB-Newton with the QMC-Newton algorithm. 
We register the distance from the requested transformation to the result 
of the particular algorithm. 




Number of 
CoefEcients 


SB-Newton 
Algorithm 


QMC-Newton Alg 


orithm 


2000 


6000 


10000 


11 X 11 X 18 


0.0045 
0.45 


0.075 
9.5 


0.023 
2.3 


0.0056 
1.5 


16 X 16 X 18 


0.0016 
0.15 


0.013 
4.5 


0.017 
1.7 


0.014 
2.1 


24 X 24 X 18 


8.8-10-^ 
0.47 


0.042 
5.6 


0.0042 
0.49 


0.0036 
0.7 


50 X 50 X 18 


1.1- 10"" 
0.4 


0.39 

52 


0.0026 
0.3 


0.0027 
0.19 



transformation: The first number in each box gives the distance of the detected to the 
real rotation matrix in the Frobenius norm, the second number the Euclidean distance 
of the detected to the real translation in pixel length. Note that a translation error 
smaller than 1, which is the size of one voxel, can be seen as perfect. We vary the 
number of coefficients used for the spline-approximation and the number of elements 
of the Halton sequence used for the Quasi Monte Carlo Method. 

We recapitulate: For the algorithms defined in section 3.1.1 we need two approx- 
imations: A spline-approximation to smooth the image and a couple of base-point 
to approximate the integrals (3.12). Also two approximations are made in the algo- 
rithms of section 4: The spline-based image smoothing and the approximation of the 
objective function (4.2). In case of the SB-Newton algorithm we achieve a overwhelm- 
ing accuracy even for very strongly smoothed images. For a given smoothing level 
the operation on the spline function space seems to be superior to the Monte Carlo 
approximation. On the other hand, the speed of the Monte Carlo based algorithms 
is hardly influenced by the level of image smoothing. It depends nearly completely 
on the chosen number of base-points for the integral approximation. Therefore, a 
comparison of the algorithms focused on the rows of Table 5.1 is limited. Comparing 
the columns of Table 5.1 could lead to the impression that we achieve better results 
by rising the number of spline-coefficients. This is not true generally: For example, 
by reducing the smoothing local extrema appears, which leads to wrong results. This 
happens, for example, in the last row of Table 5.1 for 2000 base-points. 

We have already mentioned that the speed of a QMC-Newton step depends lin- 
early on the number of base-points. In contrast the velocity of the spline-based version 
depends in a linear manner on the number of spline-coefficients of the reference. (To 
see this we make use of the fact that the function F in (4.4) has a compact support.) A 
spline-based Newton step on 24 x 24 x 18 spline-coefficients take 5.3 seconds. Whereas 
a Quasi Monte Carlo-based Newton step for 6000 base-points takes only 1.9 seconds. 
However, this value depends extremely on the way of implementation, especially on 
the use of preimplemented MATLAB methods. 

We proceed with a comparison to an established intensity-based registration 
algorithm. We choose the Levenberg-Marquardt-like (LM) algorithm described by 
Thevenaz et al (cf. [32]), since the same framework requirements are chosen to for- 
mulate the problem. Above all, they also make use of the spline-approximation of 
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Fig. 5.2. Comparison of the SB-Newton, QMC-Newton, Levenberg- Marquardt (LM) and Se- 
quential Quadratic Programming (SQP) algorithms. We choose a 2D-CT slide ('256 X 256 pixel) as 
the reference and its translation along 20 pixel as the tem,plate (cf. Table 5.2). We plot the distance 
of the calculated transformation to the particular limiting point. 



the images several times. In their approach, they use the substitution-rule in (3.26) 
to get a good approximation of the derivative of the discretization (3.22) to achieve 
low computational costs. We also compare our algorithm with a classical, extrinsic 
Sequential Quadratic Programming (SQP) algorithm. (See e.g. [4] chapter 12.4 for an 
introduction.) In this method, the objective function is approximated quadratically 
in every step, while the constraints are linearized. The resulting Newton-step is the 
performed in a vector-space of bigger dimension. 

Just like in the beginning, we take the previous 2D-CT slide as reference and a 
translated and rotated version of of it for the template. In Fig. 5.2 we demonstrate 
the convergence-behavior of the SB-Newton, the QMC-Newton, the LM and the SQP- 
algorithm. We plot the logarithm of the distance of the actual guess Mi to the 
particular limiting point Mg^, ^^qm: ^lm^ ^sqp- ^'^^ ^^'^'^ algorithm we use the 
same number of coefficients for the spline-approximation. Even though SB-Newton 
and QMC-Newton algorithms seem to dominate, we have to keep in mind that a 
difference in translation under 10*^ pixel is still an excellent result, which is attained 
by all four algorithms after 16 steps. But in contrast to LM we can choose a classical 
first derivative-based stopping criteria for the other algorithms: 

||V$(Af,)|| < T 

where T is an arbitrary threshhold. Because of the result in Fig. 5.1 we can choose T 
arbitrary small and, for sure, the stopping criteria will be achieved without running 
unnecessary many iteration steps. Therefore, no user-specification is necessary to get 
results of interest. 

Another result from Fig. 5.2 is the superior convergence rate of the SB-Newton 
and the QMC-Newton algorithms compared to the SQP method. This stresses the 
point of view that intrinsic algorithms dominate extrinsic algorithms if the constraints 
form a differentiable manifold. By embedding the set of admissible points in a vector 
space, which is done in an extrinsic algorithm like the SQP, the dimension of the 
optimization problem may explodes. As an example, the SQP method searches for 
the optimum of a function / : SE{3) ^- R in a space of 18 dimension (9 because of 
A E gl{i) instead of A e 5*0(3), 3 because of f G M^ and 6 parameters are needed 
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Requested 
Transformation 


Detected Transformations | 


SB-Newton 


QMC-Ncwton 


LM 


SQP 


0°, O 


0°. (2°o) 


0.063°, Gro",) 


0-012°, iZn) 


0.063°, {,°-l,) 


0°, O 


0.023°, Co%^^) 


0.092°, Clf^^) 


0.608°, CJlZ) 


0.092°, (3^»-«*) 


17.188°, (°) 


17.296°, (ifg) 


15.564°, Crol) 


16.506°, („Vo^,) 


15.564°, (:0:g^) 






Table 5.2 
Comparison of the SB-Newton, QMC-Newton, LM and SQP algorithms. The first column 
presents the requested transformation and the following columns the calculated transformations with 
the particular algorithms. The first number gives the rotation around the center of the image, the 
following vector gives the translation in pixel. The registration is done after a spline- approximation 
with 25^ coefficients for each algorithm. Below the table, we present the templates of the particular 
registration problem. The reference is equal to the one in the first experiment. 



for the Lagrange multipliers), whereas the SB-Newton and QMC-Newton methods 
optimize over a space of 6 dimension (3 because of ^ g 5*0(3), 3 because of i e K"^). 
Hence, extrinsic algorithm may need unnecessarily many steps, a higher complexity 
and finally, a projection step is needed in order to make sure that the result is an 
admissible point, since only the limiting point of an extrinsic algorithm is guarantee 
to be admissible. 



Since the algorithms use different kinds of approximations, the particular limiting 
points diverge. We listed them in Table 5.2 for a few registration problems. Since we 
used the same cost function for the QMC-Newton and the SQP algorithm it is not 
surprising that the detected transformations are equal for both methods. It is notable 
that in the second row the detected translation in the case of the LM-algorithm is 
wrong. The main reason for this is that LM minimize the "sum of squared difference" - 
measure while the QMC or SB-Newton algorithm maximize the correlation between 
two images. Of course, one can pass by such missleadings through reshaping the 
region of interest or through starting at a coarser scale of spline-approximation, but 
this option is also applicable for the other algorithms. 

In the next experiment, the influence of noise to the result of the SB-Newton 
algorithms is studied. As before, we take the upper left image in Fig. 5.1 as the 
reference / with 250 x 250 pixel and gray value in the range [0, 932]. To construct the 
template image g, we perform a rotation of the reference around its center with 11.5° 
and add gaussian noise from 5% to 50%. (I.e. the variance of the noise is between 0.05 
and 0.5 after rescaling the range of the image to the interval [0, 1].) The left-hand side 
of Fig. 5.3 shows the template perturbed with the biggest noise level. For each noise 
level we project the reference and the template image to a spline function space and 
use the SB-algorithm to detect the transformation. For the detected and the exact 
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quadratic difference of the gray values per pixel 




0.2 0.3 04 

level of gaussian noise 



Fig. 5.3. Left: Template image with 50% gaussian noise. Right: For the gaussian noise from 
5% to 50% we plot the mean and the 90% confidence interval of the error (5.3), detected by the 
SB-algorithm. 



transformation M^^^^^t, Af exact, we measure the discrepancy as 

^ Y, (/(^Mdctocti«) - /(PAfex.cti«))', 



2502 



(5.3) 



where the sum is over all pixel of the image. This can be seen as the averaged 
quadratic difference of the gray value. For each noise level we consider three different 
cases of spline function spaces, with 15 x 15, 23 x 23 and 48 x 48 coefficients. We 
repeat each experiment 50 times and evaluate (5.3) for each detected transformation 
-^detect- On the right-hand side of Fig. 5.3 we plot the appropriate mean and the 
90% confidence interval. As in the previous experiments, we observe a systematical 
error caused by the spline-approximation. In comparison to this, the additional error 
caused by the gaussian noise is negligible small, even for big variances. This is not 
surprising, since the spline approximation of the image is performed with respect to 
the 'sum of squared difFcrcncc'-norm, known to be the unbiast estimator in the case 
of gaussian noise. Another result of this experiment is that the systematic error of 
the spline-approximation is very small (the range of the images is [0, 932]) as we haves 
already pointed out in Table 5.1. This underlines a known fact in image processing, 
namely that we lose less information of an image by a spline approximation than by of 
a standard interpolation method (cf. [33]). Moreover, if we want to detect the exact 
transformation perfectly, we have to incorporate the algorithms in a pyramidal ap- 
proach in which we gradually increase the dimension of the spline function space. This 
procedure is quite natural and already implemented in various registration algorithms 
(see e.g. [27] and the references therein). 

It turns out that comparing the numerical costs of the algorithms is a difficult 
task: In [32] the authors overcome the necessity of evaluating the gradient of one image 
in each iteration, but the numerical costs are of the order N , since they sum over all 
N pixel on the image. In our QMC-approach we sum over a uniform distributed grid, 
which gives a certain (but hopefully negligible) error, which reduces the computational 
costs to 0((logiV)2/A^) in the case of 2D images. We want to point out that reducing 
the sum over all pixel to a sum over a grid with low discrepancy is also applicable 
for the LM-algorithms and leads to quite small approximation errors as in the QMC- 
Newton case. For the SB-Newton algorithm the computational costs are of the same 
order as the number of the spline-coefficients from the approximation. On the one 
hand, this number is often extremely small in comparison to the image size, since the 
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Fig. 5.4. The first row shows the initial position of the template with respect to the reference. 
The second row shows the astimated relative position after 12 quasi- Newton- Steps. We used a 
CBCT-image for the reference and a FBCT-image for the template. 



results of the SB-Newton algorithm are more than sufficient even in a very course 
level of spline-approximation. On the other hand, for each spline coefficient, a series 
of B-splines has to be evaluated at different numbers (cf. 4.4), which is why the costs 
per spline-coefficient are high. Therefore, the SB-Newton algorithm becomes very 
slow for a fine level of spline-approximation. 

Now, we demonstrate the algorithms with the help of some medically relevant pic- 
tures. In Fig 5.4, reference and template are FBCT and CBCT shots of the prostate 
area. These two datasets consist of 420 x 420 x 72 and 512 x 512 x 101 voxels re- 
spectively. We will not use a priori information about their relative positions to each 
other. Instead, we dispose the reference on the lower part of the template (see also 
the first row of Fig. 5.4). 

The beginning of the registration consists of comprising the data set to 26 x 26 x 25 
spline coefficients each. The second row of Fig. 5.4 shows the result of the registration 
after 12 SB-Newton steps. Any further steps only provide translations below the size 
of a voxel and rotations under 0.01 degree. 

Due to the great similarity between the two recording methods the SB-Newton 
algorithm on SE(n) was used for the registration in the last example. With the help 
of the procedure described in section 4.3 one is also able to use the algorithm for data 
which have been recorded with very different modalities. We will show this in the 
next example by comparing a MR picture with a CT (each with 512 x 512 Pixels) 
by using a PET- picture (128 x 128 Pixels) and used the "Mutual Information-based 
Registration Algorithm on SE(n)". The first row of Fig. 5.5 shows the origin of the 
correlating registration problems, and the second one reveals the result after 10 steps. 
We used a compression of 64 x 64 spline coefficients for each case mentioned. In the 
last column of Fig. 5.5 we more-over add an additional rotation of 30 degrees. Once 
more, we have reached a good match of the pictures within 10 steps. We already 
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Fig. 5.5. The first row shows the initial situation of three registration problems. In the first 
one we compare a CT- with an NMR-image and in the second and third, we compare NMR- with 
PET-images. The second row presents the result of mutual information-based algorithm after ten 
quasi- Newton-steps. 



pointed out that this method only provides a loeal eonvergence. If the pictures differ 
in their origin - the algorithm wont provide an acceptable result even if tried using 
many steps. This happens if we rotate the PET- picture as seen on third column in 
Fig. 5.5 more than 30 degrees. 

6. Conclusions. We developed a novel framework for the rigid or volume- 
preserving registration of two real valued functions. For this we used a modified 
(//, z^)— Newton algorithm to solve the corresponding optimization problem on the 
manifolds SE{n) or SA(n). The local parameterizations of these manifolds are chosen 
in such a way to get a very efficient and easily implementable algorithm. Addition- 
ally, we proved the local quadratic convergence of this method under suitable generic 
conditions. 

In order to apply this framework to the image registration task, we offered two 
strategies. The QMC-Newton compares two images in a sequence of points with low 
discrepancy. The appearing cost function could then be easily approximated by the 
Quasi Monte Carlo method. The SB-Newton strategy uses B-spline approximation of 
the images. Here, no image evaluations are necessary, the algorithms operate directly 
on the compressed (jpeg-like) data. Our numerical tests showed that both strategies 
preserve a high accuracy even in the case of high compressed data. Additionally, we 
confirmed the local quadratic convergence of both methods in numerical experiments. 
Comparatively, the QMC-Newton step is done in less computational time than a SB- 
Newton step - at least in our implementation. But it turned out that the second 
method has a higher accuracy in detecting the requested transformation. 

7. Appendix: Generic convergence conditions. In Theorem 3.4 we proved 
the local quadratic convergence of the QMC-algorithm on the condition that the 
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critical points of the cost functions arc nondegenerate. In this appendix wc wiU 
show that this condition is gencrically fulfiUed in image registration, i.e. on very 
mild conditions for the reference image g G C'^(M", M), the set of all template images 
/ e C''(M",M) for which the algorithm converges locally quadratically is open and 
dense in C'^(R"',R) in terms of the strong topology*. 

To begin with, we have to introduce some aspects of the transversality theory. 
We refer to [12] for a more detailed discussion. Let M, N be manifolds and A C N a, 
submanifold of N. A differential map f : AI —> N is transverse to A and one writes 
/ rti A if 

Ay + T,f{M,) =. Ny 

whenever f{x) = y E A. That is, the tangent space of A^ at y is spanned by the 
tangent space of A at y and the tangent space of M at x. The following two theorems 
are well known in transversal theory, their proofs can be found e.g. in [12]. 

Theorem 7.1. Let f : M ^- N he aC^ map, r ^ 1 and A C N aC^ submanifold. 
If f is transverse to A then f^^{A) is a submanifold of M . The codimension of f~^(A) 
is the same as the codimension of A in N. 

Theorem 7.2 (Parametric Transversality) . LetV,M,N be C^ manifolds without 
boundary and A d N is a C" submanifold. Let F : V ^>- C^{M,N) satisfy the 
following conditions: 

(a) the evaluation map F'^'" :VxM-^N, {v, x) i~> Fy{x) is C. 

(b) F™ is transverse to A. 

(c) r > max{Q, dim N + dim A — dim M}. 
Then the set 

fti {F]A) := {u e V^ I K, rti A} 

is residual and therefore dense. If A is closed in N and F is continuous for the strong 
topology on C'^{M,N), then rtl {F,A) is also open. 

To give the degenerated critical points a geometric interpretation, we also have 
to introduce the manifold J^{M,N) of r-jets of functions from M to A^. A r-jet 
from M to A^ is an equivalence-class [x, /, U]r of a triple in which U C M is an open 
subset, X £ M and / : J7 ^ Af is an C" map. We say that two triples [x, /, U]r and 
[a;', /', [/']r are equivalent if a; = x', and / and /' have same derivatives in x up to 
the order r. We use the notation jj^f := [x, /, U]r to denote the r-jet of / in a; and 
j^ f : M — >■ J{M,n) defines the r-prolongation map x H> j^/. The following theorem 
can be found in [12]. 

Theorem 7.3 (Jet Transversality Theorem). Let M, N be C°° manifolds without 
boundary, and let A C .I^{M,N) be a C°° submanifold. Suppose 1 ^ r < s ^ oo. 
Then 

(h' {M,N;f,A) :- {/ e C^M,N) \ ff rtl A} 

is residual and thus dense in Cg{M, N), and open if A is closed. 

Let us now come back to the Lie Groups G = SA{n) and G = SE{n) respectively. 
In the manifold J'^{G,M.) the subset 

il= {jlh I heC^{G,R), xeG, Vh{x) =0, detHess/,(a;) =0} 



*For a definition of the wealc and strong topology of function spaces wc refer to[12]. 



32 M. SCHROTER, U. HELMKE, O. SAUER 

contains all degenerated critical points. Furthermore, the set il is a finite union of 
submanifolds it = ili U . . . U U„i with 

dimili s^ . . . < dimilm = - dimG + -(diniG')^ 

Since we want to study the influence of the images /, g on the cost functions ^ and $ 
of formula (3.2) and (3.24), we introduce the linear maps *, $ : C3(M", M) -)■ C^{G, R) 
which are defined by 



1 ^ 
^{f){At):^-Y,g{x.,)f{Ax,+t) 



i=l 



$(/)(At):= / g{x)f{Ax + t)dx. 

For a given function / e C'^(R", R) the cost function ^(/) has no degenerated critical 
points if, and only ii j^'^{f) misses il. Since 

dimJ^{G,M.) = - dim G+i(dim 0)2 + 1 

we get 

dimG + dimili < dim j2(G,R), i == l,...,m. 

By means of Theorem 7.1 we get that j^^{f) misses il if, and only if P'^{f) is 
transverse to every submanifold il^ i = 1, . . . , ?7i. The same statement applies to the 
function $. 

Theorem 7.4. Suppose that the template image g G G'^(M",M) has compact 
support and satisfy the following conditions: 

a) In the optimization problem, (3.2), g is not identical to zero. 

b) In the optimization problem (3.24), there exists k = ^in + l)(ri + 2) elements 
of the sequence {a^iji^i which do not do not lie on a quadric hypersurface and 
g{xi) 7^ for all these k elements. 

Then the cost functions $ and 'f have no degenerate critical points, for a generic set 
of reference images f G G|(M",M). 

Proof. Following the argumentation above, we define 

A*., :=ftl3 (R",R; j2^,ilO := {/ G C^{R'',R) \ fi'{f)(hii,} 

and ^$_i :=fti'^ (M", M; j2<l,il,i) respectively. We have to show that A^,i and A<^^i are 
open and dense in G| (R" , M) . 

Using the Jet Transversality Theorem provides for the openess of ftl^ (G,M; j2,ili). 
Since *& and $ are continuous maps, we conclude that the sets A<s,^i and A$^i are open. 
Hence, we only have to show the densness of both sets. Therefore, we define 

A*,,,, :=rtl3 (M",R;j2*|^__(o),il,) and A$.,,, t^rtl''' (M",R; j2|.|^^(o),U,) 

respectively, while ^^^(0) is an open ball with radius r in G with respect to the 
Frobenius norm. Again, we can use the Jet Transversality Theorem and the continuity 
of <t and ^ to show that A$^i ,. and Aip.i ,, are countable intersections of open sets. Due 
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to the category theorem of Bairc and the fact that A^ ^ = nr=i ^*,i,)-i it is enough 
to show the denseness of A^^ ,, with respect to the strong topology. An analogous 
argumentation for $ shows that the denseness of A^i^ is sufficient to complete the 
proof. 

Now, let K,L C M" be two compact subset with 

{yeM.'"\y = Ax + t,xe supp(5), {A, t) e AV(0) n G} C A' and K CL 

and / e C'^(R", R). Suppose there is a sequence {/fc}j.gp^ C A^_i_r which converges to 
/ with respect to the weak topologie. Then we can construct a sequence {ft.fc};.g|!j C 
A^i^i.r with hk{x) — fk{x) for aW x ^ K and hk{x) = f{x) for all x e R" \ L such that 
hk ^ f with respect to the strong topology. Since fk (x) = hk (x) for all a; S AT implies 
hk € A^ir^i^r^ we get that the denseness of A^^^.j. with respect to the weak topology 
implies the denseness with respect to the strong topology. Since the same statement 
is true for the set A^^i^^ we only have to show that both sets are dense in Cy^/(R",M). 
Now, let us complete the proof for the function ^. Consider the vector space 
y = R X R" X Sym(n) , endowed with the standard scalar-product ( . ) : F x F — >• R, 

((a, a, A), {13, b, B)) = a/S + a^b + tr{AB) 

and the map 

ip-.W'^V, xi^ {l,x,xx'^). 

Since a quadrie hypersurface Q is defined via 

Qa,a,A = {x e R" I x'^Ax + a^x + a = 0} 

for an arbitrary element (a, a, A) £ V \ {0} we get the equivalence 

X e QaM,A <^ {{a,a,A),ip{x))=0. 

Hence, using the condition that k = ^{n + l){n + 2) elements of the sequence {xi}^^^ 
do not lie on a quadrie hypersurface and that g{xi) ^ for those elements, we get 

span{g(xi)(y9(a;i),...,g(xAr)(p(xjv)} = V. 

and a short calculation shows that also 

apa.n{g{xi)(p{Axi +t),...,g{xN)ipiAxN +t)} = V. 

is valid for each (A, t) G G. Hence, the linear map generated by the matrix 

M = {g{xi)if{Axi +t),...,g{xN)'p{AxN + t)) 

is surjective for each {A, t) € G. 

Now, consider formula (3.23) as a linear map of the form 

iJAxi+tf:---JAx,,+tf) '^ (a,^,7,e,'5)- (7.1) 
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Let nil, . . . , rrik denote the rows of il/, i.e. AI^ ~ {it^J , ■ ■ ■ , "^^ )• Then, the rrii, . . . , rrik 
can be used to buUt the rows of the representation matrix of (7.1) in the foUowing 
way: 

df df 

a^ = rflx■ (^(^a;i + t), . . . , ^(^-^JV + t))^, 

ft,! = ma • {^{Axi +<),..., jT^iAxN + t))^ 



Q'if Qf 

Therefore, the map (7.1) is surjective and with the formulas (3.13) and (3.14) for 
G = SE{n), and (3.17)-(3.21) for G = SA{n) respectively, the map 

0-L,+t/, • ■ • , jL„+t/) ^ fiA,t)^iI)- (7.2) 

is surjective for each chosen {A,t) E Gn/v(0). Moreover, the map (7.2) is transverse 
to every submanifold of J^ (G, M) . 

Now, let / G C'^(M",R) be arbitrarily chosen. To show the denseness of A^^i^r in 
terms of the weak topology, it is enough to show that A<s,^i_r n [/ + P'^^ {n, 1)] is dense. 
Here, P^^ {n, 1) denotes the set of all polynomials M" -^ R with a degree smaller than 
or equal to 27V. In the case of the map 

F°^ : [/ + P^^in, 1)] X G ^ J\G, R) 
{f+p,A,t)^jf^^,^i>{f+p) 

we already showed the surjectivity if {A, t) £ G is fixed. Hence, using the Parametric 
Transversality Theorem we get that F^^{f + p, . ) is transverse to ili for a dense 
subset of P^^{n,l). Therefore, A^^i^r is dense in Gy[/(R",M) which completes the 
proof for the cost function VP. 

Let us now consider the cost function $. Following the argumentation above, it 
is enough to prove that Aq,^i,r is dense in Gj^(M", M). Due to the conditions for the 
template image g, the set of functions 



^2 

g{x){Ax + t)k{Ax + t)i) 



dxidx 



i,j,kj = l,...,n, j < i, I i^ k 



is linear independent. Otherwise, g would fuUfil a partial differential equation, and, 
since g has compact support, this would imply that g = 0, which contradicts the 
requirements for g. To simplify the notation, we take an arbitrary order of "H^.t and 
write hi{x) for its elements, i = 1, . . . ,fc, k := jn{n + 1)(6 + 3n + m?). Now, let 
Q C K" be a cube containing supp(,g). Then, define a set of orthonormal polyno- 
mials {bj(x)) pj^ with respect to the La-norm on Q, which is in ascend order with 
respect to the degree. Moreover, define Pm '■= {bi{x), . . . , hm{x)) . Thus, the best L2 
approximation of hi{x) € T-LA.t with polynomials up to a certain degree is given by 
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with 

CLi j ~ I bj{x)hi{x)dx. 
Now, consider the matrix 



M 



ai.i • • • ai 



''k.i 



Since T-La.i is hnear independent, there exists a m G N such that all rows of M are also 
linear independent. Due to the fact that ( 'Haa ) = ( "H^ i ) for all (A,t), (A,i) G G, 
this number m is independent of the choise of (A,t). Since M is the representation 
matrix of the map 

n„^K^ /^(a,/3,7,J,e), (7.3) 

we get the surjectivity of (7.3) for each [A, t) ^ G, ii m is big enough. Hence, using 
the formulas (3.13) and (3.14) for G = SE{n), and (3.17)-(3.21) for G = SA{n) 
respectively, one can easily verify that j?^ t)^\-p i^ surjective for each chosen {A, t) S 

Gn/w(o). 

Now, let / S C'^(R", K) be arbitrarily chosen. Like in the case of the cost function 
\1/ before, we apply the Parametric Transversality Theorem to the map 

F^^ ■.[f + V.,n]^G^j''{G,W) 
if+p,A,t)^jf^^,^^f+p). 



Since F^^{ . , A,i) is surjective for ah {A,t) e Gn AV(0), the map F^^{f +p, . ) is 

73,(M",R) 



transverse to il, for a dense subset of Vm- Therefore, A$ i r is dense in Cy 



which completes the proof for the cost function $. D 

Finally, we want to note that the condition b) of Theorem 7.4 is not very restric- 
tive. Since the Region of Interest Q is typically bounded, there exists a minimum 
value nig € M of g. Hence, we can consider the registration-problem 



min / {g{x) - f{Ax + t)Ydx 

(Ad)£GjQ 

with g{x) ~ g{x) + rUg + 1 and /(x) = f{x) + nig -t- 1 instead of (1.1). For that new 
registration problem we have g{x) ^ for all x € Q. 

With this result we can guarantee with a probability of one that the QMC- 
algorithm localy converges quadratically to a local maximum in the case of non- 
artificial generated images. However, Theorem 7.4 will not give any result for the 
idealistic case, in which the reference and template image are identical. Moreover, 
the case in which both images are elements of a spline function space, or smoothed 
by a Gaussian kernel, is not applicable to this theorem. Even though these restric- 
tions are more relevant for applications, we believe that a further discussion in this 
direction is beyond the scope of this paper. 
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